Learning from Imbalanced Multiclass Sequential Data Streams Using Dynamically Weighted Conditional Random Fields
Abstract
The present study introduces a method for improving the classification performance of imbalanced multiclass data streams from wireless body-worn sensors. Data imbalance is an inherent problem in activity recognition, caused by the irregular time distribution of activities, which are sequential and dependent on previous movements. We use conditional random fields (CRFs), a graphical model for structured classification, to take advantage of dependencies between activities in a sequence. However, CRFs do not account for the negative effects of class imbalance during training. We propose a class-wise dynamically weighted CRF (dWCRF) in which weights are determined automatically during training by maximizing the expected overall F-score. Our results, based on three case studies from a healthcare application using a batteryless body-worn sensor, demonstrate that our method generally improves overall and minority-class F-score compared with other CRF-based classifiers, and achieves similar or better overall and class-wise performance compared with SVM-based classifiers under conditions of limited training data. We also confirm the performance of our approach using an additional battery-powered body-worn sensor dataset, achieving similar results in cases of high class imbalance.
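To illustrate why the abstract reports F-score rather than accuracy under class imbalance, here is a minimal sketch in plain Python. It is not the dWCRF training procedure, which the abstract does not specify. The activity labels, the toy sequences, and the choice of the macro-average as the "overall" F-score are all assumptions made for illustration.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p wrongly
            fn[t] += 1  # missed true label t
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed here as the 'overall' score)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical activity sequence: "lying" dominates, "walking" is the minority class.
y_true = ["lying"] * 8 + ["walking"] * 2
# A classifier that ignores the minority class entirely.
y_pred = ["lying"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
scores = per_class_f1(y_true, y_pred)
overall = macro_f1(y_true, y_pred)

print(f"accuracy = {accuracy:.3f}")
print(f"per-class F1 = {scores}")
print(f"macro F1 = {overall:.3f}")
```

In this toy case accuracy stays high even though the minority class is never predicted, while the minority-class F-score drops to zero and pulls the macro-average down. This is the kind of failure that an F-score-driven weighting scheme is meant to penalize.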
Similar resources
Cost Sensitive Online Multiple Kernel Classification
Learning from data streams has been an important open research problem in the era of big data analytics. This paper investigates supervised machine learning techniques for mining data streams with application to online anomaly detection. Unlike conventional machine learning tasks, machine learning from data streams for online anomaly detection has several challenges: (i) data arriving sequentia...
Predicting Dialogue Acts for Intelligent Virtual Agents with Multimodal Student Interaction Data
Recent years have seen a growing interest in intelligent game-based learning environments featuring virtual agents. A key challenge posed by incorporating virtual agents in game-based learning environments is dynamically determining the dialogue moves they should make in order to best support students’ problem solving. This paper presents a data-driven modeling approach that uses a Wizard-of-Oz ...
Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift
Concept drifts occurring in data streams will jeopardize the accuracy and stability of the online learning process. If the data stream is imbalanced, it will be even more challenging to detect and cure the concept drift. In the literature, these two problems have been intensively addressed separately, but have yet to be well studied when they occur together. In this paper, we propose a chunk-ba...
Scalable Large-Margin Online Learning for Structured Classification
We investigate large-margin online learning algorithms for large-scale structured classification tasks, focusing on a structured-output extension of MIRA, the multi-class classification algorithm of Crammer and Singer [5]. The extension approximates the parameter updates in MIRA using k-best structural decoding. We evaluate the algorithm on several sequential classification tasks, showing that ...
Generalized Stacked Sequential Learning
In many supervised learning problems, it is assumed that data is independent and identically distributed. This assumption does not hold true in many real cases, where a neighboring pair of examples and their labels exhibit some kind of relationship. Sequential learning algorithms take advantage of these relationships in order to improve generalization. In the literature, there are different appro...
Journal title:
- CoRR
Volume abs/1603.03627
Publication date 2016